Without a proper observation of the energy demand of the receiving terminals, the retailer may be obliged to purchase additional energy from the real-time market and thus risks losing profit. This paper proposes two combinatorial multi-armed bandit (CMAB) strategies for a green cloud radio access network (C-RAN) with simultaneous wireless information and power transfer, under the assumption that the central processor has no initial knowledge of the forthcoming energy demand or renewable energy supply. The aim of the proposed strategies is to find the optimal set of sizes of the energy packages to be purchased from the day-ahead market by observing the instantaneous energy demand and learning from the behaviour of cooperative energy trading, so that the total cost of the retailer is minimized. Two novel iterative algorithms, namely ForCMAB energy trading and RevCMAB energy trading, are introduced to search for the optimal set of energy packages in ascending and descending order of package sizes, respectively. Simulation results indicate that the CMAB approach in the proposed strategies offers a significant advantage in reducing the overall energy cost of the retailer, compared with other schemes without learning-based optimization.